Using Linguistic Annotations in Statistical Machine Translation of Film Subtitles
نویسندگان
چکیده
Statistical Machine Translation (SMT) has been successfully employed to support translation of film subtitles. We explore the integration of Constraint Grammar corpus annotations into a Swedish–Danish subtitle SMT system in the framework of factored SMT. While the usefulness of the annotations is limited with large amounts of parallel data, we show that linguistic annotations can increase the gains in translation quality when monolingual data in the target language is added to an SMT system based on a small parallel corpus.
منابع مشابه
Machine Translation of Film Subtitles from English to Spanish Combining a Statistical System with Rule - based Grammar
In this project we combined a statistical machine translation system for the translation of film subtitles from English to Spanish with rule-based grammar checking. At first we trained the best possible statistical machine translation system with the available training data. The largest part of the training corpus consists of freely available amateur subtitles. A smaller part are professionally...
متن کاملThe Representation of Non-Linguistic Sounds in Persian and English Subtitles for the Deaf and Hard-of-Hearing: A Comparative Study
Subtitling for the deaf and hard-of-hearing (SDH) is an area which deserves a special attention as it ena- bles these people to access to the part of the ‘world’ intended for hearing people, including the world of ‘motion pictures’, and particularly movie sounds. Compared to linguistic sounds, non-linguistic sounds have received little attention in the field of translation, although they are in...
متن کاملThe Automatic Translation of Film Subtitles. A Machine Translation Success Story?
Every so often one hears the complaint that 50 years of research in Machine Translation (MT) has not resulted in much progress, and that current MT systems are still unsatisfactory. A closer look reveals that web-based general-purpose MT systems are used by thousands of users every day. And, on the other hand, special-purpose MT systems have been in long-standing use and work successfully in pa...
متن کاملStrategies Used in the Translation of Interlingual Subtitling
This study was an attempt to identify the interlingual strategies employed to translate English subtitles into Persian and to determine their frequency, as well. Contrary to many countries, subtitling is a new field in Iran. The study, a corpus-based, comparative, descriptive, non-judgmental analysis of an English-Persian parallel corpus, comprised English audio scripts of five movies of differ...
متن کاملRule extraction for multi bottom-up tree transducers
Following the invention of computers, it was always a dream to obtain translations automatically. If we give a machine a sentence it should return a sentence in another language expressing the same meaning. In the subfield of statistical machine translation (SMT), this translation is achieved with the help of statistical models. Those models use large text collections to automatically learn bas...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009